
    Marathon: An open source software library for the analysis of Markov-Chain Monte Carlo algorithms

    In this paper, we consider the Markov-Chain Monte Carlo (MCMC) approach for random sampling of combinatorial objects. The running time of such an algorithm depends on the total mixing time of the underlying Markov chain and is unknown in general. For some Markov chains, upper bounds on this total mixing time exist but are too large to be applicable in practice. We try to answer the question of whether the total mixing time is close to its upper bounds or whether there is a significant gap between them. In doing so, we present the software library marathon, which is designed to support the analysis of MCMC-based sampling algorithms. The main application of this library is to compute properties of so-called state graphs, which represent the structure of Markov chains. We use marathon to investigate the quality of several bounding methods on four well-known Markov chains for sampling perfect matchings and bipartite graph realizations. In a set of experiments, we compute the total mixing time and several of its bounds for a large number of input instances. We find that the upper bound obtained by the famous canonical path method is several orders of magnitude larger than the total mixing time and deteriorates with growing input size. In contrast, the spectral bound is found to be a precise approximation of the total mixing time.
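    The two quantities being compared can be illustrated on a toy chain. The following is a minimal Python sketch, not the marathon API (marathon itself is a C++/CUDA library); the threshold eps = 1/4, the brute-force computation and the lazy walk on a 4-cycle are assumptions chosen for illustration only.

```python
# A minimal sketch (not the marathon API): total mixing time and a lower
# spectral bound for a small transition matrix P with stationary
# distribution pi. The threshold eps = 1/4 is a common convention and an
# assumption here.
import numpy as np

def total_variation(p, q):
    # Total variation distance between two probability vectors.
    return 0.5 * np.abs(p - q).sum()

def total_mixing_time(P, pi, eps=0.25, t_max=10_000):
    # Smallest t such that max_x ||P^t(x, .) - pi||_TV <= eps.
    Pt = np.eye(len(pi))
    for t in range(1, t_max + 1):
        Pt = Pt @ P
        if max(total_variation(Pt[x], pi) for x in range(len(pi))) <= eps:
            return t
    raise RuntimeError("chain did not mix within t_max steps")

def lower_spectral_bound(P, eps=0.25):
    # Standard lower bound (1/(1 - lambda*) - 1) * ln(1/(2*eps)), where
    # lambda* is the second-largest eigenvalue modulus of P.
    moduli = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    lam_star = moduli[1]
    return (1.0 / (1.0 - lam_star) - 1.0) * np.log(1.0 / (2.0 * eps))

# Example: lazy random walk on a cycle with four states.
P = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])
pi = np.full(4, 0.25)
print(total_mixing_time(P, pi), lower_spectral_bound(P))
```

    For a four-state chain this brute-force approach is trivial; the same computation on state graphs with many thousands of states is what motivates dedicated tooling and GPU support, as in the timing experiments reported further below.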

    Gerbil: A Fast and Memory-Efficient k-mer Counter with GPU-Support

    A basic task in bioinformatics is the counting of k-mers in genome strings. The k-mer counting problem is to build a histogram of all substrings of length k in a given genome sequence. We present the open source k-mer counting software Gerbil that has been designed for the efficient counting of k-mers for k ≥ 32. Given the technology trend towards long reads of next-generation sequencers, support for large k becomes increasingly important. While existing k-mer counting tools suffer from excessive memory resource consumption or degrading performance for large k, Gerbil is able to efficiently support large k without much loss of performance. Our software implements a two-disk approach. In the first step, DNA reads are loaded from disk and distributed to temporary files that are stored on a working disk. In a second step, the temporary files are read again, split into k-mers and counted via a hash table approach. In addition, Gerbil can optionally use GPUs to accelerate the counting step. For large k, we outperform state-of-the-art open source k-mer counting tools for large genome data sets. Comment: A short version of this paper will appear in the proceedings of WABI 201
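    The counting step itself reduces to a hash-table histogram, which the sketch below illustrates under strong simplifying assumptions (a single in-memory string, naive handling of ambiguous bases); Gerbil's two-disk pipeline, temporary files on a working disk and optional GPU acceleration are not reproduced here.

```python
# A minimal sketch of hash-table based k-mer counting; the in-memory string
# and the skipping of k-mers containing 'N' are simplifying assumptions.
from collections import Counter

def count_kmers(sequence: str, k: int) -> Counter:
    # Build a histogram of all substrings of length k.
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if "N" not in kmer:  # skip k-mers containing ambiguous bases
            counts[kmer] += 1
    return counts

print(count_kmers("ACGTACGTAC", k=4).most_common(3))
```

    One reason large k is harder for counting tools: with the usual 2-bit encoding per base, a k-mer fits into a single 64-bit machine word only for k ≤ 32, so longer k-mers need wider keys and more memory per hash-table entry.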

    Timing of Train Disposition: Towards Early Passenger Rerouting in Case of Delays

    Passenger-friendly train disposition is a challenging, highly complex online optimization problem with uncertain and incomplete information about future delays. In this paper we focus on the timing within the disposition process. We introduce three different classification schemes to predict as early as possible the status of a transfer: whether it will almost surely break, is so critically delayed that it requires manual disposition, or can be regarded as only slightly uncertain or as being safe. The three approaches use lower bounds on travel times, historical distributions of delay data, and fuzzy logic, respectively. In experiments with real delay data we achieve an excellent classification rate. Furthermore, using realistic passenger flows we observe that there is a significant potential to reduce the passenger delay if an early rerouting strategy is applied.
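    The first of the three schemes, based on lower bounds on travel times, can be sketched as follows. The field names, the waiting-time rule and the three-way split are illustrative assumptions for this sketch, not the paper's exact classification rules; the key observation is that a lower bound on the feeder train's arrival time can prove that a transfer breaks, but never that it is safe.

```python
# A sketch of the lower-bound idea for classifying a transfer. All names and
# thresholds are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Transfer:
    scheduled_departure: float  # departure time of the connecting train (min)
    earliest_arrival: float     # lower bound on the feeder's arrival time (min)
    min_transfer_time: float    # minimum time needed to change trains (min)
    max_wait: float             # longest the connecting train may be held (min)

def classify(t: Transfer) -> str:
    # Earliest moment at which the passenger could board the connecting train.
    earliest_ready = t.earliest_arrival + t.min_transfer_time
    if earliest_ready > t.scheduled_departure + t.max_wait:
        return "broken"    # missed even under the optimistic lower bound
    if earliest_ready > t.scheduled_departure:
        return "critical"  # reachable only if the connecting train is held
    return "uncertain"     # a lower bound alone cannot prove the transfer safe

# A feeder that, at best, arrives 3 minutes before a departure that can be
# held for at most 3 minutes, with a 5-minute transfer time: 'critical'.
print(classify(Transfer(scheduled_departure=60.0, earliest_arrival=57.0,
                        min_transfer_time=5.0, max_wait=3.0)))
```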

    Increased bioavailability of phenolic acids and enhanced vascular function following intake of feruloyl esterase-processed high fibre bread: a randomized, controlled, single blind, crossover human intervention trial

    Background & aims Clinical trial data have indicated an association between wholegrain consumption and a reduction in surrogate markers of cardiovascular disease. Phenolics present in wholegrain bound to arabinoxylan fibre may contribute these effects, particularly when released enzymatically from the fiber prior to ingestion. The aim of the present study was therefore to determine whether the intake of high fibre bread containing higher free ferulic acid (FA) levels (enzymatically released during processing) enhances human endothelium-dependent vascular function. Methods A randomized, single masked, controlled, crossover, human intervention study was conducted on 19 healthy men. Individuals consumed either a high fibre flatbread with enzymatically released free FA (14.22 mg), an equivalent standard high fibre bread (2.34 mg), or a white bread control (0.48 mg) and markers of vascular function and plasma phenolic acid concentrations were measured at baseline, 2, 5 and 7 h post consumption. Results Significantly increased brachial arterial dilation was observed following consumption of the high free FA (‘enzyme-treated’) high fibre bread verses both a white bread (2 h: p 0.05). Conclusion Dietary intake of bread, processed enzymatically to release FA from arabinoxylan fiber during production increases the bioavailability of FA, and induces acute endothelium-dependent vasodilation. Clinical trial registry: No NCT03946293. Website www.clinicaltrials.gov

    The quality of the upper bounds for rapidly mixing instances.

    <p>The results of <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0147935#pone.0147935.g003" target="_blank">Fig 3</a> filtered to highlight instances with known polynomial mixing time. Instances with no known polynomial bound are coloured gray.</p

    Influence of the average vertex degree.

    <p>Connection between average vertex degree of a state graph and its total mixing time, respectively canonical path bound.</p

    Single and double precision performance of the total mixing time computation.

    <p>The charts show the running time for the computation of the total mixing time on the example of five state graphs of size 8012 to 20358. Due to the relatively small amount of GPU memory on our test system, only the first four (respectively two) state graphs could be processed by the GPU implementation in single precision mode (respectively double precision mode). The running times were measured on an Ubuntu 14.04 system with a Intel Xeon E3-1231, NVIDIA GeForce GTX 970 (4 GB GPU memory) and 16 GB of main memory, using <i>gcc</i> in version 4.8.4 and <i>CUDA</i> in version 7.0.</p

    Relationship between the lower spectral bound and the total mixing time.

    <p>The total mixing time is shown in connection to a corresponding lower spectral bound for sequence pairs of the form (<i>n</i> − 1, <i>n</i> − 2, 2, 1), (2, 2, …, 2). We use the displayed formulas to predict missing values for total mixing time.</p